Document information collection method and document information collection apparatus

ABSTRACT

Method and apparatus for automatically storing a referred electronic document in a database and notifying a user of new information. In response to a request issued from a document display control unit 106, a request relay unit 108 acquires a document from a document information management unit 113 and stores it in a document database 117. When a document which matches an interested item 112 is stored, an information monitor unit 111 notifies the user of it.

BACKGROUND OF THE INVENTION

The present invention relates to a method for collecting document information through a communication network, and more particularly to a document information collection method and a document information collection apparatus which automatically form a database from document information acquired from a plurality of information sources.

Many document management servers for electronically referring to documents, such as an on-line news service, an electronic bulletin board system and a document database, are known. Some of those systems allow retrieval of a stored document, but the retrieval is conducted over the entire data in the system. Accordingly, when a user wants to refer again to a document which the user has referred to in the past, the user must either find it in the result of a retrieval over the entire system or explicitly register each referred document in a database which the user manages. When the user utilizes a plurality of document management servers, the user must access the respective systems one after another to conduct the retrieval, because there is no means for retrieving in one pass from a plurality of systems having different document acquire protocols or communication protocols.

As a method for determining, from a number of document management servers, the particular server in which a document of interest resides, it is known to provide a server directory, which is a database for determining the documents which each of the document management servers contains, as described in "Outlook of Next-generation Information Distribution System", Richard Maron Stein, NIKKEI BYTE, November 1991, pp. 320-331. However, the server directory disclosed therein can manage only information on document management servers which communicate under a particular protocol. Further, the information on each document management server managed by the server directory is a document in which a manager of that server has described the features of the server; it is not always written from the viewpoint of a retrieving user, and it does not always properly express the documents managed by each server.

Some document management servers have a function to register a retrieval condition in advance and to notify the user when a new document which matches the condition is registered in the document management server. However, the user is informed only of newly arrived information on the document management server in which the retrieval condition has been registered; even if newly arrived information on another document management server matches the retrieval condition, the user is not informed. Further, in order to detect newly arrived documents, it is necessary to periodically access the document management server to execute the same retrieval, or to individually write a program which periodically acquires the documents.

Recently, the number of documents which are electronically accessible, as well as the number of document management servers which provide documents, has become huge.

Even if a user wants to read again a document which the user has read before, it is difficult to remember the location of the document. Even if the user wants to save documents at hand, the user does not know which documents will be required later, and it is troublesome to save every document that the user reads. Some document servers do not have a retrieval function, in which case it is difficult to locate a desired document. Even with a server having a retrieval function, the retrieval result may be huge depending on the retrieval condition, and it is difficult to find the desired document in the result. Further, since the retrieval function differs from server to server, the user must either remember the server in which a particular document resides or access many document management servers.

It is therefore a first object of the present invention to provide a document information collection method and a document information collection apparatus which automatically and collectively store documents referred to from various document management servers in a document database which is managed by a user, so that the documents can be retrieved.

When a document which has not been referred to before but which matches a condition is to be retrieved, and it is not known from which document management server the document is to be retrieved, it is necessary to access many document management servers. As described above, the method of providing a server directory, which is a database for the document management servers, has its own problems.

It is a second object of the present invention to provide a document information collection method and a document information collection apparatus which allow the detection of the particular document management server in which a document matching a user's intent resides.

The document management servers and the documents in each server are increasing day by day, and even if a document management server or a new document which may be a new information source of interest appears, it is difficult to find it. It is particularly difficult to discover the existence of a document management server which is a new information source. However, it is highly probable that a document or a document management server referred to by another user who has a similar interest to that of the user includes document information of interest to the user, and such information is effective in detecting a new document or information source.

It is a third object of the present invention to provide a document information collection method and a document information collection apparatus which allow the automatic detection of information of interest to a user from documents or document management servers newly found by other users.

SUMMARY OF THE INVENTION

In accordance with the present invention, there are provided document display means for issuing a document display request, and request relay means for relaying the document display request, acquiring a document from a document management server in which the document is stored, and storing the document in a document database having document retrieval means. The document display means, the document management server, the document database and the request relay means are connected through a communication network. The request relay means accepts document display requests from a plurality of document display means, acquires documents from a plurality of document management servers and collectively stores the acquired documents in the document database.

Further, there are provided interested item memory means for storing interested items of a user, information monitor means for monitoring a document to be registered in the document database and information on the document management server from which the document is acquired, and information notify means for notifying the user of information on the document itself or information on the document management server when a document which matches an interested item of the user is registered in the document database. The information notify means may notify the user asynchronously. The document display means sends to the request relay means the same document display request as that which it would send directly to the document management server. The request relay means sends the same document display request to the document management server, receives a response from the document management server, acquires the document and stores it in the document database. At the same time, the request relay means sends the response received from the document management server back to the document display means as it is. Thus, the user who issues the request through the document display means can build a database of the documents which the user has referred to without the extra work of storing the documents in the document database. Since there is no change in the request or the response method, it is not necessary to modify the existing document management server and document display means.

The request relay means acquires documents from a plurality of document management servers having different document acquire protocols and stores them in the document database. Thus, it is possible to retrieve documents from a plurality of information sources through one document database.

The request relay means may be shared by a plurality of users. When the request relay means is used by a plurality of users, the information monitor means monitors the document database, and when a document which matches an interested item of a user is registered, it notifies the user, through the information notify means, of the document or of information on the document management server in which the document is stored. Thus, the user may detect the existence of a document or a document management server which was found by another user.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a configuration of a document information collection apparatus of the present invention,

FIG. 2 shows a system configuration which utilizes the document information collection apparatus of the present invention,

FIG. 3 shows a format of a document display data table,

FIG. 4 shows a format of a server management data table,

FIG. 5 shows a format of a document information data table,

FIG. 6 shows a format of an interested item data table,

FIG. 7 shows an operation of a document display client and a document management server in a prior art method,

FIGS. 8A-8D show formats of requests issued by the document display client to the document management server,

FIGS. 9A and 9B show formats of responses sent back from the document management server to the document display client,

FIG. 10 shows an operation of a request relay unit,

FIG. 11 is an operation flow chart of the request relay unit,

FIG. 12 shows types of requests issued by the document display client to the document management server and the operations thereof,

FIG. 13 shows a configuration of a database control unit,

FIGS. 14A and 14B show a format of a retrieval result and a display thereof, respectively,

FIG. 15 shows another configuration of the request relay unit,

FIG. 16 shows a format of a document acquire request record table,

FIG. 17 shows an operation flow chart of an information monitor unit,

FIG. 18 shows a flow chart for message preparation, and

FIGS. 19A-19D show types of message and displays thereof.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

An embodiment of the present invention will be described with reference to the accompanying drawings.

FIG. 1 shows a configuration of a document information system including a document information collection apparatus of the present invention. Functions of the respective units are now explained with reference to FIG. 1.

A document display client 101, a database server 102 and a document management server 103 are connected through a communication network 115.

The document display client 101 is a computer connected to the communication network 115 and it operates as a user terminal. The document display client 101 accepts a request from a user through an input device 105 such as a keyboard or a pointing device such as a mouse.

A document display control unit 106 is client software for the document management server 103; it displays on a display device 104 either a document or guide information for displaying documents, such as a document list menu. The user inputs a request to the document display control unit 106 through the input device 105 to designate a document which the user wants to read. Display document data 116 is used by the document display control unit 106 to display a screen, and it is either the data of the document itself or data for displaying a document list. The display document data 116 will be described in detail later with reference to FIG. 3. A database control unit 118 is client software for retrieving data from and registering data in a document database 117. A message control unit 107 handles messages sent asynchronously to the user, even when the user is not operating the document display client 101, and it may use electronic mail.

The document management server 103 stores documents, receives a request from a client through the communication network 115 and sends back the corresponding document to the client. The document management server 103 comprises a document information management unit 113 and document data 114. The document data 114 comprises documents and management information therefor. The document information management unit 113 manages the document data 114 and responds to a document acquire request from the document display control unit 106 on the document display client 101.

The database server 102 relays the request and the response between the document display client 101 and the document management server 103 and stores the document data in the database. Namely, the database server 102 receives the request from the document display client 101, transfers the request to the document management server 103, receives the response from the document management server 103, and sends the response back to the document display client 101. If the response is document data, it is stored in the database.

A configuration of the database server 102 is as follows. A request relay unit 108 relays the request and the response between the document display client 101 and the document management server 103 based on the information in server management data 119.

A format of the server management data 119 is shown in FIG. 4. Numerals 410, 411, 412, 413 and 414 denote entries for the respective document management servers. Each entry comprises a document management server ID 400, a protocol 401, a server name 402, a location 403, an effective period 404 and additional information 405. The document management server ID 400 is an identifier for uniquely identifying each entry. The protocol 401 is data indicating the communication protocol used by the document management server. The server name 402 is data indicating the name of the document management server. In the example shown in FIG. 4, the document management servers which use a protocol A include "∘ server", "Δ server" and "x server", and the document management servers which use a protocol C include "∘∘ server". Depending on the protocol, the server name may not be required or the server may be uniquely determined. In such a case, the data in the server name 402 is not necessary. An example thereof is the document management server which uses the protocol C, designated by 414. The effective period 404 is data indicating how long data on the document database 117 remains effective. The effective period is used when a document previously acquired from the document management server is to be reused. A document management server may be of a type in which a document, once registered, will never be modified, or of a type in which documents are periodically updated, and the lifetimes of documents vary. When "-1" is designated in the effective period 404, it indicates a document management server of the type in which documents are never modified, and when a positive number is designated, it indicates that a previously acquired document may be considered valid until the number of days indicated by the number has elapsed. The number need not always equal the interval at which the document is actually updated by the document management server, but in that case the data of the document on the document management server may differ from the data of the document which the user acquires on the document display client 101. In order to avoid this, "0" may be designated for the effective period 404 so that, whenever a document is acquired, the document management server is always accessed to acquire the latest document data. The additional information 405 is data for describing other necessary information, and the described data may differ from protocol to protocol of the document management server. In the example of FIG. 4, the protocol A describes the last access time to the document management server and the protocol B describes a log-in user name used in accessing the server as the additional information.
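The effective period rules can be illustrated with a minimal Python sketch. The class name `ServerEntry`, the function `is_cached_copy_valid`, the field names and the sample values are assumptions made for illustration and do not appear in FIG. 4; treating the last day of the period as still valid is likewise an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ServerEntry:
    """One entry of the server management data 119 (field names are hypothetical)."""
    server_id: int             # document management server ID 400
    protocol: str              # protocol 401
    server_name: str           # server name 402 (may be empty for some protocols)
    location: str              # location 403
    effective_period: int      # effective period 404: -1, 0, or a number of days
    additional_info: str = ""  # additional information 405


def is_cached_copy_valid(entry: ServerEntry, registered_on: date, today: date) -> bool:
    """Decide whether a previously acquired document may be reused."""
    if entry.effective_period == -1:
        return True    # documents on this server are never modified
    if entry.effective_period == 0:
        return False   # always go back to the document management server
    age_in_days = (today - registered_on).days
    return age_in_days <= entry.effective_period  # inclusive bound is an assumption


# Sample values are made up for the example.
entry = ServerEntry(1, "A", "o server", "address 1", effective_period=7)
print(is_cached_copy_valid(entry, date(2024, 1, 1), date(2024, 1, 5)))   # True
print(is_cached_copy_valid(entry, date(2024, 1, 1), date(2024, 1, 20)))  # False
```

Keeping the three cases ("-1", "0", positive) in one function mirrors how the effective period 404 is a single field with two reserved values.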

The document database 117 receives requests from the request relay unit 108 and the database control unit 118, stores the document data and provides the retrieval function. The document database 117 comprises a database management unit 109 and document information data 110.

The document information data 110 has the format shown in FIG. 5. Numerals 510, 511 and 512 denote entries for the respective documents. Each entry comprises an ID 501, a document management server ID 502, a document ID 503, a user 504, a date 505, an attribute 506 and a text 507. The ID 501 is an identifier for uniquely identifying each entry. The document management server ID 502 is data for identifying the document management server in which the original document is stored, and it corresponds to the data of item 400 in FIG. 4. The document ID 503 is data indicating the document identifier on the document management server. When additional information such as a location in the document is necessary, it may be added to the document ID 503. The user 504 is the user name of the user who requested the document. The user 504 may not be used, depending on the protocol. The date 505 indicates the date on which each entry was registered. The attribute 506 is data describing other necessary information, and the described data may differ from protocol to protocol of the document management server. For example, a title or the number of characters may be described. The text 507 is the text data of the document itself.

The documents stored in the document database 117 may be retrieved by using the function of the database management unit 109. As retrieval keys, any attribute information and classification information which are present in the table shown in FIG. 5 may be designated. When the document database 117 has a whole text retrieval function, the retrieval may be conducted by using any word included in the text. In the whole text retrieval function, a condition may be designated by "A and B and C", which is a logical expression of words included in the document, or a plurality of words may be designated as the condition, a correlation between each document and those words may be calculated, and the documents are arranged in the ascending order of correlation.
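As a rough illustration of the entry format of FIG. 5 and of whole text retrieval with an "A and B and C" condition, the following Python sketch may help. The names `DocumentEntry`, `search_all_words` and `correlation`, the sample entries, and the choice of a simple occurrence count as the correlation measure are all assumptions; the specification does not define how the correlation is calculated.

```python
from dataclasses import dataclass


@dataclass
class DocumentEntry:
    """One entry of the document information data 110 (field names are hypothetical)."""
    id: int            # ID 501
    server_id: int     # document management server ID 502
    document_id: str   # document ID 503
    user: str          # user 504
    date: str          # date 505
    attribute: str     # attribute 506, e.g. a title
    text: str          # text 507


def search_all_words(entries, words):
    """Return the entries whose text contains every word ("A and B and C")."""
    lowered = [w.lower() for w in words]
    return [e for e in entries if all(w in e.text.lower() for w in lowered)]


def correlation(entry, words):
    """Hypothetical correlation score: total occurrences of the words in the text."""
    text = entry.text.lower()
    return sum(text.count(w.lower()) for w in words)


# Sample entries are made up for the example.
entries = [
    DocumentEntry(1, 1, "7", "user1", "2024-01-10", "title 1", "New computers and new products"),
    DocumentEntry(2, 1, "10", "user1", "2024-01-11", "title 2", "Weather report"),
]
hits = search_all_words(entries, ["computers", "products"])
print([e.id for e in hits])              # [1]
print(correlation(entries[0], ["new"]))  # 2
```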

An information monitor unit 111 monitors new information on the document database 117, and when a document relating to an item in which the user is interested is newly registered, it notifies the user through a message control unit 107. Interested item data 112 is data relating to the interested items of the user and the addressee of the message.

A format of the interested item data 112 is shown in FIG. 6. Numerals 610, 611, 612 and 613 denote entries for the interested items. Each entry comprises an ID 601, an execution date 602, notify means 603, an address 604, a frequency 605, a form of notification 606, a condition 607 and an object 608. The ID 601 is an identifier for uniquely identifying each entry. The execution date 602 is the date on which the interested item was last notified, and it is updated each time a notification is executed. The notify means 603 is data indicating the means for notifying the interested item. As the notify means, electronic mail or a specific file on the server may be designated. The address 604 designates an electronic mail address when the notify means 603 is electronic mail, and a file name when the notify means 603 is a file. When the user wants to check for the interested item, the user may review the content of the electronic mail or periodically refer to the file on the server. The frequency 605 indicates how often data which matches the interested item is notified. For example, the number of notifications per predetermined period, such as monthly or daily, is designated. The form of notification 606 indicates a message format. As the format, a conventional text format or a format for a special program may be designated. The condition 607 designates the interested item of the user. For example, the entry 610 indicates an interest in "computers and new products". The object 608 designates the object of notification for the interested item. When "document" is designated as the object 608, the document which matches the interested item is notified as the interested item data. When the document management server is designated as the object 608, the document management server in which the document matching the interested item is stored is notified as the interested item data.
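The role of the condition 607 and the object 608 can be sketched in Python as follows. The names `InterestedItem` and `notification_for`, the values "mail", "document" and "server", the sample address, and the rule that every term of the condition must appear in the text are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class InterestedItem:
    """A subset of one entry of the interested item data 112 (names are hypothetical)."""
    id: int                      # ID 601
    notify_means: str            # notify means 603: "mail" or "file"
    address: str                 # address 604: mail address or file name
    condition: Tuple[str, ...]   # condition 607, as a tuple of terms
    target: str                  # object 608: "document" or "server"


def notification_for(item: InterestedItem, document_text: str, server_name: str) -> Optional[str]:
    """Return what would be notified for a newly registered document, or None."""
    text = document_text.lower()
    if not all(term.lower() in text for term in item.condition):
        return None  # the document does not match the interested item
    # The object 608 selects between the document itself and its server.
    return document_text if item.target == "document" else server_name


# Sample values are made up for the example.
item = InterestedItem(610, "mail", "user@example.com", ("computers", "new products"), "server")
print(notification_for(item, "New products for computers", "o server"))  # o server
print(notification_for(item, "Weather report", "o server"))              # None
```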

The message control unit 107 exchanges messages with the message controlunit on the document display client 101.

The document display client 101, the database server 102 and the document management server 103 may be located on the same computer or on separate computers. The document database 117 may be located on one computer or may be a database distributed across a plurality of computers.

A plurality of document display clients and a plurality of document management servers may be provided. In FIG. 2, numerals 201, 202, 203 and 204 denote document management servers which have individual document management functions. The document management servers may be of one type or of a plurality of types. In FIG. 2, numerals 201 and 202 denote document management servers which accept document acquire requests based on the common protocol A, numeral 203 denotes a document management server which accepts document acquire requests based on the protocol B, and numeral 204 denotes a document management server which accepts document acquire requests based on the protocol C.

In FIG. 2, numerals 205, 206 and 207 denote document display clients. On the document display client 205, the user uses a document display client for the protocol A to refer to a document of the document management server a 201. On the document display client 206, the user uses a document display client for the protocol A to refer to a document of the document management server b 202. On the document display client 207, the user uses a document display client for the protocol C to refer to a document of the document management server d 204.

Numeral 102 in FIG. 2 shows details of the request relay unit 108 of the database server 102 of FIG. 1. The request relay unit 108 can relay a plurality of protocols. A protocol A relay unit 208 relays requests and responses between the document display client and the document management server by the protocol A. A protocol B relay unit 209 relays requests and responses between the document display client and the document management server by the protocol B. A protocol C relay unit 210 relays requests and responses between the document display client and the document management server by the protocol C. A document registration unit 211 receives document data from the respective protocol relay units and stores the document data in the document database 117. A document retrieval unit 212 retrieves data from the document database 117.

Details of the functions of the respective units and the flow of operation among the units are described below.

First, referring to FIG. 7, an operation between the document display client 101 and the document management server 103 in a prior art method is described. Numeral 701 of FIG. 7 shows the display content of the display device 104 when it is connected to the "∘ server", which is one of the document management servers. The document display control unit 106 has data as shown in FIG. 3 as the display document data 116. FIG. 3 shows the data for displaying a main screen when connected to the ∘ server. In the present embodiment, the main screen is a document list menu of the ∘ server. Numeral 300 in FIG. 3 indicates the number of bytes of the data 301 of the display text displayed on the screen. Numerals 302, 303 and 304 describe information for making a transition from the main screen to a sub-screen or to a screen of the document text.

For example, numeral 303 instructs the document display control unit 106, when it displays the display text data on the display device 104, to underscore the bytes 22-24, that is, the words "item 2" in the present embodiment, so that the user is notified that a transition to the item 2 is available, and to enable the transition to the item 2 when the underscored portion is pointed to by the input device 105. At the same time, as the information for the transition to the item 2, it describes that the item 2 is the entire document indicated by the "document ID 10" of the document management server "Δ server". Numeral 304 indicates that the item 3 is the bytes 200-300 of the document indicated by the "document ID 7" of the "∘ server".
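The transition information of FIG. 3 behaves like a small table of links keyed by byte ranges of the display text. A minimal Python sketch under that reading follows. The names `Link` and `link_at` are hypothetical, and the underscored byte range 32-34 given for the item 3 is invented for the example because the text states a range only for the item 2.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Link:
    """Transition information for one underscored span (names are hypothetical)."""
    first_byte: int                      # first underscored byte of the display text
    last_byte: int                       # last underscored byte (inclusive)
    server_name: str                     # document management server of the target
    document_id: int                     # document ID of the target
    portion: Optional[Tuple[int, int]]   # byte range of the target, None for the entire document


def link_at(links: List[Link], offset: int) -> Optional[Link]:
    """Return the link whose underscored span contains the pointed byte offset."""
    for link in links:
        if link.first_byte <= offset <= link.last_byte:
            return link
    return None


links = [
    Link(22, 24, "Δ server", 10, None),        # item 2 (numeral 303)
    Link(32, 34, "∘ server", 7, (200, 300)),   # item 3 (numeral 304); span 32-34 is made up
]
pointed = link_at(links, 33)
print(pointed.document_id, pointed.portion)  # 7 (200, 300)
```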

When the user points to the "item 3" portion of 701 with the input device 105, the document display control unit 106 receives this as an acquire request for the item 3 and issues a request 702 to the ∘ server. Since the ∘ server is a server corresponding to the protocol A, the request has a format which is compatible with the protocol A. FIGS. 8A-8D show examples of the request 702. FIG. 8A shows a general form of a request corresponding to the protocol A and FIG. 8B shows a specific example for the ∘ server. The request 702 corresponding to the protocol A comprises a command 801, a server name 802, an argument 1 (803) and an argument 2 (804). The command 801 designates an operation on the server such as "acquire document", "update document" or "display sub-menu". In the present embodiment, "acquire document" 805 is designated. The server name 802 designates the request destination server among the document management servers corresponding to the protocol A. In the present embodiment, the "∘ server" 806 is designated. The argument 1 (803) designates the document ID. In the present embodiment, the "document ID 7" 807 is designated. The argument 2 (804) designates the portion of the document to be acquired. In the present embodiment, the "bytes 200-300" 808 is designated. The "entire document" may also be designated.

The above request is of the form compatible with the protocol A. A general form of another request 702, corresponding to the protocol C, is shown in FIG. 8C, and a specific example thereof is shown in FIG. 8D. In the request 702 corresponding to the protocol C, only the user name and the document ID are designated. In the protocol C, the server is fixed and a location in the document cannot be designated.
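The two request forms can be contrasted in a short Python sketch. The specification lists the fields of each request but not their wire encoding, so the "|"-separated text line and the leading protocol tag used here are purely assumptions, as are the names `RequestA`, `RequestC`, `encode` and `decode` and the sample user name.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class RequestA:
    """Request 702 for the protocol A (FIG. 8A)."""
    command: str       # command 801, e.g. "acquire document"
    server_name: str   # server name 802
    document_id: str   # argument 1 (803)
    portion: str       # argument 2 (804), e.g. "200-300" or "entire document"


@dataclass
class RequestC:
    """Request 702 for the protocol C (FIG. 8C): no server name, no portion."""
    user_name: str
    document_id: str


def encode(request: Union[RequestA, RequestC]) -> str:
    """Serialize a request to one text line (hypothetical encoding)."""
    if isinstance(request, RequestA):
        fields = ["A", request.command, request.server_name, request.document_id, request.portion]
    else:
        fields = ["C", request.user_name, request.document_id]
    return "|".join(fields)


def decode(line: str) -> Union[RequestA, RequestC]:
    """Parse a line produced by encode() back into a request object."""
    protocol, *fields = line.split("|")
    if protocol == "A":
        return RequestA(*fields)
    if protocol == "C":
        return RequestC(*fields)
    raise ValueError(f"unknown protocol: {protocol}")


request_a = RequestA("acquire document", "∘ server", "7", "200-300")
print(encode(request_a))  # A|acquire document|∘ server|7|200-300
print(decode(encode(RequestC("user1", "7"))))
```

Because the protocol C request carries no server name, a relay handling it would have to take the destination from its own configuration rather than from the request.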

The document information management unit 113 of the ∘ server analyzes the request 702 to extract the requested data. In the present embodiment, the "200-300 bytes of document ID 7" 703 is requested. Then, the document information management unit 113 extracts the display document data 116 corresponding to the request from the document data 114 and sends back a response 705 corresponding to the protocol A to the requester. An example of the response 705 corresponding to the protocol A is shown in FIGS. 9A and 9B. FIG. 9A shows a general form of the response corresponding to the protocol A and FIG. 9B shows a specific example thereof. The response corresponding to the protocol A comprises a status 901, additional information 902 and display document data 116. The status indicates whether the request succeeded or failed. In the present embodiment, "OK" 903 indicating success is sent back. The additional information indicates the size of the display document data 116 in the present embodiment, and "100 bytes" 904 is sent back. The display document data is the data used when information is displayed on the display device 104 by the document display control unit 106. In the present embodiment, the information of bytes 200-300 of the "item 3" is contained. The format of the response 705 differs from protocol to protocol, as does that of the request 702. Numeral 706 of FIG. 7 shows the state of the display device 104 after the document display control unit 106 has received the response 705 corresponding to the request 702.
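A minimal Python sketch of how a protocol A response could be assembled is given here. The names `ResponseA` and `build_response` and the failure status "ERROR" are assumptions; only the "OK" status appears in the text. Since "bytes 200-300" yields "100 bytes" in the example of FIG. 9B, the sketch treats the range as excluding its end offset, which is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class ResponseA:
    """Response 705 for the protocol A (FIG. 9A)."""
    status: str            # status 901
    additional_info: str   # additional information 902 (here: size of the data)
    data: bytes            # display document data 116


def build_response(document: bytes, start: int, end: int) -> ResponseA:
    """Extract the requested portion and wrap it in a response."""
    if not 0 <= start < end <= len(document):
        return ResponseA("ERROR", "0 bytes", b"")  # failure value is hypothetical
    data = document[start:end]                     # end offset excluded (assumption)
    return ResponseA("OK", f"{len(data)} bytes", data)


document = bytes(400)  # stand-in for the document with "document ID 7"
response = build_response(document, 200, 300)
print(response.status, response.additional_info)  # OK 100 bytes
```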

In this manner, in the prior art method, the request and the response are exchanged by direct communication between the document display client 101 and the document management server 103.

An operation when the request relay unit 108 in accordance with the present invention is used is now explained. As shown in FIG. 10, the request relay unit 108 relays the request and the response between the document display client 101 and the document management server 103. In the present embodiment, the document display client 101 and the document management server 103 communicate by using the protocol A, and the request relay unit 108 also conducts the operation corresponding to the protocol A. Namely, the protocol A relay unit 208 in the request relay unit 108 of FIG. 2 is used.

A flow of an operation of the request relay unit 108 is now explained with reference to FIGS. 10 and 11. FIG. 10 shows a flow of data around the request relay unit 108 and FIG. 11 shows steps of the operation of the request relay unit.

[Step 3010] The request 702 by the protocol A is received from the document display control unit 106.

[Step 3020] An instruction decoder 1003 (FIG. 10) of the request relay unit 108 extracts an instruction field 801 (FIGS. 8A-8D) from the request 702.

[Step 3025] The instruction decoder 1003 extracts a server name 802 (FIGS. 8A-8D) from the request 702.

[Step 3030] Whether the data in the instruction field 801 is "acquire document" or not is determined. The instructions of a request by the protocol A include "display menu", "acquire document" and "update document" as shown in FIG. 12. The request relay unit 108 stores functions describing the processes corresponding to the respective instructions. In the example shown in FIG. 12, "function 2" is executed for the "acquire document" instruction and "function 1" is executed for the other instructions. In accordance with the respective functions, the process proceeds to a step 3040 if the instruction is "acquire document" and to a step 3032 otherwise.

[Step 3032] The request relay unit 108 transfers the request 702 to the "∘ server", which is the document management server corresponding to the server name 802. The request relay unit 108 uses the server management data 119 to communicate with the ∘ server. In the present embodiment, the entry 410 shown in FIG. 4 corresponds thereto, and the connection is made to the document management server at the "address 1". When the ∘ server receives the request, it operates in the same manner as in the prior art shown in FIG. 7 to generate a response.

[Step 3034] The request relay unit receives the response 705 by the protocol A from the "∘ server" and the process proceeds to a step 3090.

[Step 3040] The instruction decoder 1003 extracts the document information field, that is, the argument 1 and the argument 2, from the request 702. The argument 1 and the argument 2 correspond to 803 and 804 shown in FIG. 8, and in the present embodiment the "document ID 7" 807 and "200-300" 808 correspond thereto, respectively.

[Step 3050] Whether the document data corresponding to the document information field acquired in the step 3040 is managed by the document database 117 in the database server 102 or not is determined. Specifically, each entry of the document information data 110 shown in FIG. 5 is examined through the database management unit 109, and whether an entry for the bytes "200-300" of the "document ID 7" on the "∘ server", a document management server using the "protocol A", is present or not is determined. If it is present, the process proceeds to a step 3052, and if it is not present, the process proceeds to a step 3060. Since it is not present in the present embodiment, the process proceeds to the step 3060.

Step 3052! Whether the document data present in the document database is within the effective period or not is determined. Specifically, the date on which the document data has been registered (505 in FIG. 5) is examined, a difference from the current date is calculated, and whether the document data is within the number of days of the effective period shown by 404 in FIG. 4 is determined. If it is within the effective period, the process proceeds to a step 3054, and if it is not within the effective period, the process proceeds to a step 3060.

Step 3054! Document data relating to the requested document (those corresponding to the respective entries of FIG. 5) is acquired from the document information data 110 in the document database 117 through the database management unit 109.

Step 3056! The response corresponding to the protocol A is generated from the document data and the process proceeds to a step 3090.

Step 3060! The request received in the step 3010 is transferred to the server acquired in the step 3025. In the present embodiment, it is transferred to the "∘ server" on the "address 1" by using the "protocol A". When the "∘ server" receives the request, it operates in the same manner as that explained in the prior art method shown in FIG. 7 to generate the response.

Step 3065! The response 705 is acquired from the document management server.

Step 3070! Information 1002 (FIG. 10) comprising the document attribute information (902 in FIG. 9B) and the display document data (116 in FIG. 9B) is extracted from the response 705 and it is registered as a new entry of the document database through the database management unit 109.

Step 3080! The registration of the new document in the document database 117 is notified to the information monitor unit.

Step 3090! The response 705 is returned to the document display control unit.

In the manner described above, the document which the user referred to on the document display client 101 can be automatically stored in the document database 117 without a special registration operation.
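The flow of the steps 3030 to 3090 can be summarized in a minimal sketch. It is an illustration only, not the implementation of the embodiment: the names `RequestRelay`, `Entry`, `fetch` and `notify` are hypothetical, the document database 117 is reduced to an in-memory dictionary, and the effective period is assumed to be a whole number of days compared inclusively.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Entry:
    """Simplified stand-in for one entry of the document information data (FIG. 5)."""
    data: str
    registered: date  # registration date (505 in FIG. 5)


class RequestRelay:
    def __init__(self, fetch, effective_days, notify):
        self.fetch = fetch                    # hypothetical: forwards a request to the document management server
        self.effective_days = effective_days  # effective period in days (404 in FIG. 4)
        self.notify = notify                  # hypothetical: callback to the information monitor unit
        self.database = {}                    # stands in for the document database 117

    def handle(self, instruction, server, doc_id, byte_range, today):
        # Step 3030: only "acquire document" is routed through the database.
        if instruction != "acquire document":
            return self.fetch(instruction, server, doc_id, byte_range)  # steps 3032-3034

        key = (server, doc_id, byte_range)    # step 3040: document information field
        entry = self.database.get(key)        # step 3050: is the document already stored?
        if entry is not None:
            age = today - entry.registered    # step 3052: effective period check
            if age <= timedelta(days=self.effective_days):
                return entry.data             # steps 3054-3056: answer from the database

        data = self.fetch(instruction, server, doc_id, byte_range)  # steps 3060-3065
        self.database[key] = Entry(data, today)                     # step 3070: register new entry
        self.notify(key)                                            # step 3080: inform the monitor
        return data                                                 # step 3090
```

With an effective period of 7 days, a second request for the same bytes of the same document three days later is answered from the dictionary, while a request ten days later goes back to the server and refreshes the entry.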

Further, since there is no change in the request and the response for the document display client 101 and the document management server 103, the same document display client may be connected to the document management server through the request relay unit, and when the document is not stored in the database, it may be directly connected to the document management server. Further, the document management system of the existing client-server configuration may be readily built in the document information collection apparatus of the present invention. When the document management system of the client-server configuration is newly constructed, the document information collection apparatus of the present invention may be built in if the interfaces of the request and the response are defined so that the server which is the source of collection of the document information can be readily expanded.

When the document requested by the user is stored in the document database 117 and it is within the effective period, the document data stored in the document database 117 may be returned. Since the access to the document database 117 is fast, the time to acquire the document may be shortened when the access to the document management server 103 takes a long time.

The document data registered in the document database 117 in the manner described above can be retrieved through the database control unit 118 on the document display client 101. The database control unit 118 may execute only the retrieval, with the reference to the retrieval result conducted by the document display control units 106 corresponding to the respective protocols, or the database control unit 118 may both execute the retrieval and refer to all document data.

A configuration when the database control unit 118 has both functions of the execution of retrieval and the reference to the document data is shown in FIG. 13. Numeral 1301 denotes a user interface unit which conducts the acceptance of the request from the user and the display of the information. The user interface unit 1301 comprises a retrieval condition input unit 1307 for inputting a retrieval condition of the document data on the document database 117 and a document display device 1302 for displaying the retrieval result and the document which matches the retrieval condition. The retrieval condition inputted through the retrieval condition input unit 1307 is sent to the database management unit 109 of the document database 117 through the retrieval control unit 1306, the retrieval is conducted and the result is returned to the retrieval control unit 1306. The retrieval result is displayed to the user through the document display device 1302.

FIG. 14A shows display of a list of the retrieval result. In the present example, there are three documents, "document 1", "document 2" and "document 3", which match the retrieval condition. As described above, the data of documents at various locations on the communication network 115 utilizing various protocols are stored in the document database 117. Thus, the document which is the retrieval result may have been acquired from the document management servers corresponding to various protocols. FIG. 14B shows data of the retrieval result returned from the database management unit 109 to the retrieval control unit 1306. In the example of FIG. 14B, the "document 1" is the entire document of the document designated by the "document ID 5" of the "∘∘ server" corresponding to the "protocol B", the "document 2" is the document designated by the "document ID 8" of the document management server corresponding to the "protocol C", and the "document 3" is the bytes "200-300" of the document designated by the "document ID 7" of the "∘ server" corresponding to the "protocol A". The document display device 1302 receives the data shown in FIG. 14B and displays it as shown in FIG. 14A.

A protocol conversion unit 1303 displays the data corresponding to various protocols on the document display device 1302. The protocol conversion unit 1303 comprises units for conducting processes corresponding to the respective protocols. Numeral 1304 denotes a unit which conducts a process when a document on the document management server corresponding to the protocol A is requested, and numeral 1305 denotes a unit which conducts a process when a document on the document management server corresponding to the protocol B is requested. For example, when the "document 3" 1402 of FIG. 14B is requested, the document data of the "document 3" is received from the database management unit 109 and it is displayed on the document display device 1302 through the protocol A conversion unit 1304.
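The per-protocol units 1304 and 1305 amount to a dispatch on the protocol recorded with each document. The following is a minimal sketch under assumed names: the record layout, the two converter functions and the `CONVERTERS` table are hypothetical, and the rendered strings merely stand in for whatever the document display device 1302 actually shows.

```python
def convert_protocol_a(record):
    # Hypothetical rendering for protocol A, which addresses a byte range of a document.
    return f"[A] {record['doc_id']} bytes {record['range']}: {record['data']}"


def convert_protocol_b(record):
    # Hypothetical rendering for protocol B, which addresses a whole document.
    return f"[B] {record['doc_id']}: {record['data']}"


# One converter per supported protocol, in the spirit of units 1304 and 1305.
CONVERTERS = {
    "protocol A": convert_protocol_a,
    "protocol B": convert_protocol_b,
}


def display(record):
    """Select the converter by protocol and return the text to be displayed."""
    converter = CONVERTERS.get(record["protocol"])
    if converter is None:
        raise ValueError(f"no conversion unit for {record['protocol']!r}")
    return converter(record)
```

Supporting a further protocol then only requires registering one more function in the table.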

The database control unit 118 may share the function of the document display control unit 106 of FIG. 1. In this case, the database control unit 118 may conduct the reference to the documents on the document management server 103 as well as the retrieval and the reference of the documents on the document database 117.

Since the document data on the document database 117 can be retrieved, the user can retrieve the document which the user has previously referred to, and when the request relay unit 108 is shared by a plurality of users, the user may retrieve a necessary document from the documents which other users have referred to. Further, when there are a plurality of document management servers which are information sources, the documents on the plurality of document management servers may be retrieved in one run.

The object to be retrieved need not be the document but it may be a document management server. In this case, the retrieval is executed with a certain retrieval condition, and if a matching document is present, the information of the document management server (the information of FIG. 4 corresponding to 502 of FIG. 5) is displayed in the retrieval result list instead of the document name, together with the number of documents found in each document management server. In this manner, the document management server which is the information source holding the document which the user wants can be detected.

In the present embodiment, when the request relay unit 108 accepts the document acquire request from the document display client 101, it acquires the document from the document management server 103, returns it to the document display client 101 and stores it in the document database 117. Alternatively, the storing of the documents in the document database 117 may be collectively conducted at a later time. A configuration of the database server 102 when this method is adopted is shown in FIG. 15. The request relay unit 108 comprises a request record unit 1501 for recording the document acquire request and manages a document acquire request record 1502. When the request relay unit 108 accepts the document acquire request from the document display client 101, it acquires the document from the document management server 103 and returns it to the document display client 101, and the request record unit 1501 adds the document acquire request to the document acquire request record 1502.

A data format of the document acquire request record is shown in FIG. 16. Each entry of the document acquire request record comprises an ID 1601, a document management server ID 1602, a document ID 1603 and a user 1604.

The ID 1601 is an identifier for uniquely identifying each entry. The document management server ID 1602 is data for identifying the document management server in which the original document is stored and it corresponds to the data of item 400 of FIG. 4. The document ID 1603 is data indicating a document identifier on the document management server. When additional information such as a position in the document is needed, the additional information is added to the document ID 1603. The user 1604 is a user name of the user who requested the document. The user 1604 may not be used depending on the protocol.

The request relay unit 108 periodically refers to the document acquire request record 1502, extracts the information of each entry, acquires the corresponding document from the corresponding document management server 103 and stores it in the document database 117. Then, the updating of the document database 117 is notified to the information monitor unit 111. At the same time, the documents which have newly been added may be notified.
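The deferred variant of FIG. 15 and FIG. 16 can be sketched as follows. This is an illustration under stated assumptions: `DeferredRelay`, `RequestRecord`, `fetch` and `notify` are hypothetical names, the document database 117 is an in-memory dictionary, and clearing the record after each batch run is an assumption that the description does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RequestRecord:
    """One entry of the document acquire request record (FIG. 16)."""
    id: int                     # ID 1601
    server_id: str              # document management server ID 1602
    document_id: str            # document ID 1603
    user: Optional[str] = None  # user 1604, unused for some protocols


class DeferredRelay:
    def __init__(self, fetch, notify):
        self.fetch = fetch    # hypothetical: acquires a document from a document management server
        self.notify = notify  # hypothetical: callback to the information monitor unit 111
        self.record = []      # document acquire request record 1502
        self.database = {}    # stands in for the document database 117
        self._next_id = 1

    def acquire(self, server_id, document_id, user=None):
        """Serve the request immediately and only log it for later storage."""
        data = self.fetch(server_id, document_id)
        self.record.append(RequestRecord(self._next_id, server_id, document_id, user))
        self._next_id += 1
        return data

    def store_recorded(self):
        """Periodic batch run: acquire each recorded document again and store it."""
        added = []
        for entry in self.record:
            key = (entry.server_id, entry.document_id)
            self.database[key] = self.fetch(entry.server_id, entry.document_id)
            added.append(key)
        self.record.clear()     # assumption: processed entries are dropped
        if added:
            self.notify(added)  # report the update and the newly added documents
        return added
```

The request path stays as short as a plain relay, at the cost of a second acquisition from the document management server during the batch run.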

When the request relay unit is used by a plurality of users, the documents on various document management servers accessed by various users are stored in the document database 117. In accordance with the present invention, each user may detect the information on the documents and the document management servers accessed by other users who share the request relay unit.

The information monitor unit 111 monitors new information on the document database 117, and when a document which matches the interested item of the user is registered, it notifies the user. This operation is described below. The operation of the information monitor unit may be started either by a timer set to start it at a predetermined interval or when the updating of the document database 117 is notified from the request relay unit 108.

Referring to FIG. 17, an operation when the information monitor unit 111 is started is explained in detail.

Step 4010! The interested item data 112 is accessed to acquire a set of data of the interested items as shown in FIG. 6.

Step 4020! The counter i is set to "1".

Step 4030! The interested item data acquired in the step 4010 is checked to determine whether the interested item is registered or not. If the interested item is registered, the process proceeds to a step 4035, and if it is not registered, the process is terminated.

Step 4040! The previous execution date 602 and the frequency 605 of the interested item of the ID 601 corresponding to the counter i are acquired. The subsequent steps up to a step 4080 conduct the process for the interested item corresponding to the counter i.

Step 4045! Whether the validation on the interested item corresponding to the counter i is to be conducted or not is determined based on the current date and the previous execution date 602 and the frequency 605 acquired in the step 4040. For example, when the current date is 95.8.2, the entry 610 indicates the frequency of once a month and it is not executed. The entries 611 and 612 indicate once a day and they are executed. When the validation of the interested item is to be conducted, the process proceeds to a step 4050, and when it is not to be conducted, the process proceeds to a step 4090.

Step 4050! A set of documents registered on the document database 117 after the previous execution date 602 is acquired through the database management unit 109.

Step 4060! The document which matches the condition 607 designated for the interested item is retrieved from the set of documents acquired in the step 4050 through the database management unit 109.

Step 4070! The result of the retrieval conducted in the step 4060 is converted to a designated notification form. This step will be described later in detail with reference to FIG. 18.

Step 4080! The result prepared in the step 4070 is notified to a designated address 604 by designated notify means 603.

Step 4090! The counter i is incremented by one.

Step 4100! Whether the interested item of the ID 601 corresponding to the counter i is present in the interested item data 112 or not is determined, and if it is present, the process proceeds to a step 4040, and if it is not present, the process is terminated.
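The loop of FIG. 17 can be sketched as below. Several simplifications are assumptions made for illustration only: the frequency 605 is expressed as a number of days, the condition 607 is a plain keyword match, notification is skipped when nothing matches, and the previous execution date 602 is advanced after each check. The names `InterestedItem`, `is_due`, `run_monitor` and `send` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class InterestedItem:
    """Simplified entry of the interested item data 112 (FIG. 6)."""
    id: int                   # ID 601
    previous_execution: date  # previous execution date 602
    notify_means: str         # notify means 603
    address: str              # address 604
    frequency_days: int       # frequency 605, simplified to a number of days
    condition: str            # condition 607, simplified to a keyword


def is_due(item, today):
    # Step 4045: compare the elapsed time with the frequency.
    return (today - item.previous_execution).days >= item.frequency_days


def run_monitor(items, documents, today, send):
    """documents is a list of (registration date, text) pairs."""
    for item in items:               # counter i: steps 4020, 4090 and 4100
        if not is_due(item, today):  # step 4045
            continue
        # Step 4050: documents registered after the previous execution date.
        new_docs = [text for registered, text in documents
                    if registered > item.previous_execution]
        # Step 4060: keep those matching the condition.
        matches = [text for text in new_docs if item.condition in text]
        if matches:                  # assumption: nothing is sent for an empty result
            send(item.notify_means, item.address, matches)  # steps 4070-4080
        item.previous_execution = today  # assumption: the date is advanced here
```

Run on 1995-08-02, an item with a 30-day frequency last executed on 1995-07-15 is skipped, while a daily item last executed on 1995-08-01 is checked against the documents registered since that date.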

Referring to FIG. 18, the step 4070 is explained in detail.

Step 5010! A set of documents of the result of the retrieval conducted in the step 4060 is acquired.

Step 5020! Whether the object 608 to be retrieved which is designated as the interested item is the "document" or the "server" is determined. If it is the "server", the process proceeds to a step 5030, and if it is the "document", the process proceeds to a step 5070.

Step 5030! The document management server IDs 502 (FIG. 5) of the respective documents of the set of documents acquired in the step 5010 are acquired.

Step 5040! The set of documents acquired in the step 5010 is clustered for each of the document management server IDs acquired in the step 5030 to prepare subsets.

Step 5050! The numbers of elements of the subsets clustered for each of the document management server IDs in the step 5040 are counted.

Step 5060! The information on the document management servers corresponding to the document management server IDs is acquired from the server management data 119 (FIG. 4).

Step 5065! The message on the document management server information which is the retrieval result is prepared in the format designated by the interested item notification form 606.

FIG. 19D shows an example of the message prepared in the step 5065 in the format used by the database control unit 118 as described with reference to FIG. 13. The message comprises the number of bytes 1910 of the message text, the message text 1911 and server information of the document management servers which are the retrieval result. There are as many pieces of server information as there are document management servers which hold the interested items. In the example of FIG. 19D, two pieces of information, for the "server 1" and the "server 2", are present. The server information comprises information necessary for the database control unit 118 to access the document management server: a server name 1912, a protocol used 1913, a location 1914 and additional information 1915. FIG. 19C shows a manner of display of the message to the user by the database control unit 118. Since the number of elements of the set of documents for each document management server ID is counted in the step 5050, the number of documents which match the interested item found in each document management server may be displayed so that it may be used as a degree of correlation between the documents held in each document management server and the interested item of the user. When the user designates the "server 2" through the input device 105, the database control unit accesses the "Main Menu" 1915 of the "server 2".

Step 5070! The attribute of each document of the set of documents acquired in the step 5010 is acquired from the document information data 110 (FIG. 5) through the database management unit 109.

Step 5080! The message on the document management information which is the retrieval result is prepared in the format designated by the notification format 606 of the interested item.

FIG. 19B shows an example of the message prepared in the step 5080 in the format used by the database control unit 118 as described with reference to FIG. 13. The message comprises the number of bytes 1900 of a message text, the message text 1901 and document information of the documents which are the retrieval result. There are as many pieces of document information as there are elements in the set of documents acquired in the step 5010. In the example of FIG. 19B, two pieces of information, for the "document 1" and the "document 2", are present. The document information comprises information necessary for the database control unit 118 to display the document: a document name 1902, a protocol used 1903, a server name 1904, a document ID 1904 and additional information 1905. FIG. 19A shows a manner of display of the message to the user by the database control unit 118. When the user designates the "document 2" through the input device 105, the database control unit acquires the document through the request relay unit 108 and displays it.
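The two branches of FIG. 18 can be sketched in one function. The layout below only mirrors the parts named in the description (byte count, message text and one entry per server or document); the dictionary keys, the function name `build_message` and the use of UTF-8 for the byte count are assumptions for illustration, not the format of FIG. 19B or FIG. 19D.

```python
from collections import Counter


def build_message(target, documents, servers, text):
    """Prepare a notification message for the object 608 ("server" or "document").

    documents: list of dicts with keys name, protocol, server_id, document_id.
    servers:   dict mapping a server ID to a dict with keys name, protocol, location.
    """
    if target == "server":
        # Steps 5030-5050: cluster the documents by server ID and count each subset.
        counts = Counter(doc["server_id"] for doc in documents)
        # Step 5060: attach the server management data to each count.
        entries = [
            {**servers[server_id], "matching_documents": count}
            for server_id, count in counts.items()
        ]
    else:
        # Step 5070: one entry of document attributes per retrieved document.
        entries = [
            {
                "name": doc["name"],
                "protocol": doc["protocol"],
                "server": servers[doc["server_id"]]["name"],
                "document_id": doc["document_id"],
            }
            for doc in documents
        ]
    # Steps 5065 and 5080: byte count of the text, the text itself, then the entries.
    return {"bytes": len(text.encode("utf-8")), "text": text, "entries": entries}
```

In the server branch, the per-server count is what allows the display to indicate how strongly each document management server correlates with the interested item.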

When the notify means 603 is an electronic mail and the format designated by the notification format 606 is the format as shown in FIG. 3 which can be handled by the database control unit 118 as explained with reference to FIG. 1, the retrieval result may be referred to by the database control unit 118 and the document text may be directly acquired from the screen of the retrieval result, so that the operation from the reception of the message to the reference of the document text can be smoothly conducted.

In this manner, the user can automatically detect the documents which match the previously registered interested item and the document management servers in which those documents reside.

In accordance with the document information collection method and the document information collection apparatus of the present invention, the document referred to can be automatically stored in the document database. When a document of interest to the user or a document management server in which the document of interest resides occurs, the user may receive the notification of the information thereof.

In accordance with the present invention, other documents which refer to the document designated by the user as a reference can be retrieved, so that the retrieval of the latest reference, which could not be attained in the prior art reference document retrieval method, is attained.

What is claimed is:
 1. A method of controlling a relaying server in a document information collection system having a document management server which collects documents therein, a terminal which allows a user to retrieve a document from said document management server and a relaying server which relays a document transferred between said document management server and said terminal, the method comprising the steps of: holding (a) delivery information having an item of information set therein which is used to see if the document is one which a user desires the delivery thereof, (b) a notification condition used to decide whether notification to user takes place or not, and (c) said item of interest information set in the delivery information, said (a), (b) and (c) being held in association with each other; holding necessary information required to receive said transferred document, determine to see if said document corresponds to said item of information set in advance and transfer said document when said document corresponds to said item of information; and deciding whether said notification condition set is met and delivering said document to the user based on said delivery information when said notification condition is met.
 2. A method according to claim 1 further comprising the steps of: transferring a document display request to the document management server; and returning the document and the item of information to said document management server as a reply to the document display request.
 3. A method according to claim 1, wherein said relaying server relays a request to a plurality of document management servers using the same protocol and a document display request to a plurality of document management servers using different protocols.
 4. A document information collection apparatus comprising: document display means connected through a communication network; a document management server for enabling the acquisition of a document by using a predetermined document acquire process; request relay means for accepting a document acquire request by the document acquire process from said document display means, transferring the document acquire request to said document management server to acquire the document and sending back the document to said document display means; and a document database for storing the document in response to the request from said request relay means.
 5. A document information collection apparatus according to claim 4 wherein said request relay means accepts requests from a plurality of different users and a plurality of different terminals and requests by a plurality of different document acquire processes, acquires documents from the document management servers corresponding to the respective document acquire processes and stores the documents in the same document database.
 6. A document information collection apparatus according to claim 4 wherein said message comprises attribute information of the document, document management server information necessary to acquire a text of the document and document identifier information and is displayed on said document display means, and when the text of each document is to be acquired, it is acquired from said document display means through said request relay means.
 7. A document information collection apparatus according to claim 4 wherein a unit of retrieval result by said retrieval means is document management server information in which the document matching a retrieval condition is stored.
 8. A document information collection apparatus according to claim 7 further comprising: correlation evaluation means for evaluating a correlation between the document and the retrieval condition; and means for calculating the correlation for each document management server.
 9. A document information collection apparatus according to claim 4 further comprising: interested item memory means for storing a set of user name, an interested item and a notification address; information monitor means for monitoring a document newly stored in said document database to determine whether a document which matches the interested item is registered or not; and message notify means for sending a message describing the information of the document when the document which matches the interested item is registered, to the notification address to notify the user.
 10. A document information collection apparatus according to claim 9 wherein said message notify means is an electronic mail.
 11. A document information collection apparatus according to claim 9 wherein a unit of the interested item is document management server information in which a document matching the retrieval condition is stored.
 12. In a document information collection system having a document managing apparatus which collects documents therein and a terminal to allow a user to retrieve a document from said document managing apparatus, a relaying apparatus to relay a document which is transferred between said document managing apparatus and said terminal, said relaying apparatus comprising: memory means for storing: (a) delivery information having an item of interest information set therein which is used to see if the document is one which a user desires to be delivered and (b) a notification condition used to decide whether notification to user takes place or not, both (a) and (b) being stored in said memory means in association with each other, said memory means further storing information necessary for receiving said transferred document, determining to see if said document corresponds to said item of interest information set in advance and transferring said document when said document corresponds to said item of interest information; and delivery means for deciding whether said notification condition is met and delivering said document to the user based on said delivery information when said notification condition is met.
 13. A document information collection system according to claim 12, wherein said relaying apparatus accepts requests from a plurality of different users and a plurality of different terminals, and acquires, by a plurality of different document acquire processes, documents from the document management servers corresponding to the respective document acquire processes and stores the documents in the same document database.
 14. A document information collection system according to claim 12, further comprising: retrieval means including at least one of attribute retrieval means for retrieving an attribute of each document and document classification means for classifying the document in accordance with a predetermined rule, wherein a desired document is retrieved from said database using said retrieval means.
 15. A document information collection system according to claim 14, wherein said message comprises attribute information of the document, document managing apparatus information necessary to acquire a text of the document and document identifier information and is displayed on said terminal, and when the text of each document is to be acquired, it is acquired from said terminal through said relaying apparatus.
 16. A document information collection system according to claim 14, wherein a unit of retrieval result by said retrieval means is document managing apparatus information in which the document matching a retrieval condition is stored.
 17. A document information collection system according to claim 16, further comprising: correlation evaluation means for evaluating a correlation between the document and the retrieval condition; and means for calculating the correlation for each document managing apparatus.
 18. A document information collection system according to claim 12, further comprising: interested item memory means for storing a set of user name, an interested item and a notification address; information monitor means for monitoring a document newly stored in said document database to determine whether a document which matches the interested item is registered or not; and message notify means for sending a message describing the information of the document when the document which matches the interested item is registered, to the notification address to notify the user.
 19. A document information collection system according to claim 18, wherein said message notify means is an electronic mail.
 20. A document information collection system according to claim 18, wherein a unit of the interested item is document managing apparatus information in which a document matching the retrieval condition is stored.